Voice Conversion Based on Trajectory Model Training of Neural Networks Considering Global Variance

Authors

Naoki Hosaka

Kei Hashimoto

Keiichiro Oura

Yoshihiko Nankaku

Keiichi Tokuda

Abstract

This paper proposes a new method for training deep neural networks (DNNs) for statistical voice conversion. DNNs are now used as conversion models that represent the mapping from source features to target features in statistical voice conversion. However, conventional DNN-based voice conversion has two major problems: 1) the inconsistency between the training and synthesis criteria, and 2) the over-smoothing of the generated parameter trajectories. In this paper, we introduce a parameter trajectory generation process that considers the global variance (GV) into the training of DNNs for voice conversion. A consistent framework using the same criterion for both training and synthesis provides better conversion accuracy in the original static feature domain, and over-smoothing can be avoided by optimizing the DNN parameters on the basis of the trajectory likelihood considering the GV. Experimental results show that the proposed method outperforms the conventional DNN-based method in terms of both speech quality and speaker similarity.
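To make the over-smoothing problem concrete, the following minimal sketch computes the GV of a feature trajectory, taken here as the per-dimension variance over the frames of one utterance (the usual definition in the voice conversion literature, assumed rather than quoted from this paper). The function names, the toy sinusoidal trajectory, and the moving-average filter are all illustrative assumptions; the filter is only a crude stand-in for over-smoothing and is not the paper's conversion model or training criterion.

```python
import math


def global_variance(trajectory):
    """Return the per-dimension variance over time of a trajectory.

    `trajectory` is a list of frames, each frame a list of feature values.
    """
    num_frames = len(trajectory)
    num_dims = len(trajectory[0])
    means = [
        sum(frame[d] for frame in trajectory) / num_frames
        for d in range(num_dims)
    ]
    return [
        sum((frame[d] - means[d]) ** 2 for frame in trajectory) / num_frames
        for d in range(num_dims)
    ]


def moving_average(trajectory, width=5):
    """Average each frame with its neighbours (hypothetical smoothing step)."""
    half = width // 2
    num_dims = len(trajectory[0])
    smoothed = []
    for t in range(len(trajectory)):
        # The window is truncated at the utterance boundaries.
        window = trajectory[max(0, t - half): t + half + 1]
        smoothed.append(
            [sum(f[d] for f in window) / len(window) for d in range(num_dims)]
        )
    return smoothed


# Toy one-dimensional "natural" trajectory: a sinusoid over 100 frames.
natural = [[math.sin(0.3 * t)] for t in range(100)]
oversmoothed = moving_average(natural, width=9)

gv_natural = global_variance(natural)
gv_smoothed = global_variance(oversmoothed)

print("GV of natural trajectory:     ", gv_natural[0])
print("GV of over-smoothed trajectory:", gv_smoothed[0])
```

In this toy case the smoothed trajectory has a visibly smaller GV than the original, which is the shrinkage that a GV term in the trajectory likelihood is meant to penalize.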


Similar resources

Estimating and modeling monthly mean daily global solar radiation on horizontal surfaces using artificial neural networks

In this study, an artificial neural network based model for predicting the solar energy potential of Kerman province in Iran has been developed. Meteorological data from 12 cities for a period of 17 years (1997–2013) and solar radiation data for five cities in and around Kerman province, obtained from the Iranian Meteorological Office data center, were used for training and testing the network. Meteorologic...

Statistical singing voice conversion based on direct waveform modification with global variance

This paper presents techniques to improve the quality of voices generated through statistical singing voice conversion with direct waveform modification based on spectrum differential (DIFFSVC). The DIFFSVC method makes it possible to convert singing voice characteristics of a source singer into those of a target singer without using vocoder-based waveform generation. However, quality of the co...

Parallel-Data-Free Voice Conversion Using Cycle-Consistent Adversarial Networks

We propose a parallel-data-free voice conversion (VC) method that can learn a mapping from source to target speech without relying on parallel data. The proposed method is general-purpose, high quality, and parallel-data-free, which works without any extra data, modules, or alignment procedure. It is also noteworthy that it avoids over-smoothing, which occurs in many conventional statistical mode...

Implementation of Computationally Efficient Real-Time Voice Conversion

This paper presents an implementation of real-time processing of statistical voice conversion (VC) based on Gaussian mixture models (GMMs). To develop VC applications for enhancing our human-to-human speech communication, it is essential to implement real-time conversion processing. Moreover, it is useful to reduce computational complexity of the conversion processing for making VC applications...

Link Prediction using Network Embedding based on Global Similarity

Background: Link prediction is one of the most widely studied problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. Local link prediction approaches based on node structure are fast but not accurate enough. On the other hand, the global link predicti...

Publication date: 2016
